The K-means clustering algorithm determines its initial number of clusters randomly, and original datasets contain many redundant features; both problems reduce clustering accuracy. In addition, the Cuckoo Search (CS) algorithm suffers from slow convergence and weak local search. To address these issues, a K-means clustering algorithm combined with Dynamic CS Feature Selection (DCFSK) was proposed. Firstly, an adaptive step size factor was designed for the Levy flight phase to improve the search speed and accuracy of the CS algorithm. Then, the discovery probability was dynamically adjusted to balance global search and local search and to accelerate the convergence of the CS algorithm. In this way, an Improved Dynamic CS algorithm (IDCS) was constructed, on which a Dynamic CS-based Feature Selection algorithm (DCFS) was built. Secondly, to improve the accuracy of the traditional Euclidean distance, a weighted Euclidean distance was designed that considers the contributions of both samples and features to the distance calculation. To select the optimal number of clusters, weighted intra-cluster and inter-cluster distances were constructed based on this weighted Euclidean distance. Finally, to overcome the defect that the objective function of traditional K-means clustering considers only the distance within clusters and ignores the distance between clusters, an objective function based on the median silhouette coefficient was proposed. Thus, a K-means clustering algorithm based on adaptive cuckoo optimization feature selection was designed. Experimental results show that IDCS achieves the best metrics on ten benchmark test functions. Compared with algorithms such as K-means and DBSCAN (Density-Based Spatial Clustering of Applications with Noise), DCFSK achieves the best clustering results on six synthetic datasets and six UCI datasets.
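The dynamic adjustment of the step size factor and the discovery probability described above can be illustrated with a minimal sketch. The abstract does not give the actual update formulas, so the exponential and linear schedules below, the function names `step_size_factor` and `discovery_probability`, and all default bounds are assumptions chosen only for illustration; they are not the formulas of IDCS.

```python
def _progress(t, t_max):
    """Return the fraction of the run completed, clipped to [0, 1]."""
    if t_max <= 0:
        raise ValueError("t_max must be positive")
    return min(max(t / t_max, 0.0), 1.0)


def step_size_factor(t, t_max, alpha_max=1.0, alpha_min=0.01):
    """Assumed schedule: decay the Levy flight step size factor
    exponentially from alpha_max (early, coarse moves) to alpha_min
    (late, fine moves)."""
    return alpha_max * (alpha_min / alpha_max) ** _progress(t, t_max)


def discovery_probability(t, t_max, pa_max=0.5, pa_min=0.05):
    """Assumed schedule: decay the nest discovery probability linearly
    from pa_max to pa_min over the run."""
    return pa_max - (pa_max - pa_min) * _progress(t, t_max)


if __name__ == "__main__":
    # Print both schedules at a few iterations of a 100-iteration run.
    for t in (0, 25, 50, 75, 100):
        print(t, round(step_size_factor(t, 100), 4),
              round(discovery_probability(t, 100), 4))
```

Whether the two quantities should decrease or increase over the iterations, and at what rate, is a design choice that the source does not specify; the sketch only shows how iteration-dependent control parameters replace the fixed constants of the standard CS algorithm.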
Accurately classifying massive volumes of user text comments has important economic and social benefits. At present, most text classification methods apply a text encoder directly in front of various classifiers and ignore the prompt information contained in the label text. To address this issue, a Text and Label Information Fusion Classification model based on the RoBERTa (Robustly optimized BERT pretraining approach) pre-trained model, namely TLIFC-RoBERTa, was proposed. Firstly, the RoBERTa pre-trained model was used to obtain word vectors. Then, a Siamese network structure was used to train the text and label vectors separately, and the label information was mapped to the text through interactive attention, so as to integrate the label information into the model. Finally, an adaptive fusion layer was set to closely fuse the text representation with the label representation for classification. Experimental results on the Today Headlines and THUCNews datasets show that, compared with mainstream deep learning models such as RA-Labelatt (replacing static word vectors in Label-based attention improved model with word vectors trained by RoBERTa-wwm) and LEMC-RoBERTa (RoBERTa combined with Label-Embedding-based Multi-scale Convolution for text classification), TLIFC-RoBERTa achieves the highest accuracy and the best classification performance on user comment datasets.
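The idea of mapping label information onto the text through attention and then fusing the two representations can be sketched as follows. This is a simplified illustration under stated assumptions, not the TLIFC-RoBERTa architecture: the scaled dot-product scoring, the fixed scalar `gate` (which would be learned in a real model), and the function names `label_attention` and `gated_fusion` are all hypothetical, and the token and label vectors are assumed to come from some upstream encoder.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def label_attention(tokens, labels):
    """For each token vector, attend over the label vectors and return
    the label-aware context vectors together with the attention weights."""
    d = len(labels[0])
    contexts, weights = [], []
    for tok in tokens:
        # Scaled dot-product score between this token and every label.
        w = softmax([dot(tok, lab) / math.sqrt(d) for lab in labels])
        # Weighted sum of label vectors, one value per dimension.
        ctx = [sum(w[j] * labels[j][k] for j in range(len(labels)))
               for k in range(d)]
        weights.append(w)
        contexts.append(ctx)
    return contexts, weights


def gated_fusion(tokens, contexts, gate=0.5):
    """Blend text and label-aware representations element-wise.
    gate=1.0 keeps only the text; gate=0.0 keeps only the label context."""
    return [[gate * t + (1.0 - gate) * c for t, c in zip(tok, ctx)]
            for tok, ctx in zip(tokens, contexts)]
```

In this sketch each token receives a distribution over labels, so tokens that resemble a label embedding pull that label's information into their representation before classification.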
To address the difficulty current semantic segmentation algorithms have in balancing real-time inference with high-precision segmentation, a Squeezing and Refining Network (SRNet) was proposed to improve both inference speed and segmentation accuracy. Firstly, One-Dimensional (1D) dilated convolution and a bottleneck-like structure unit were introduced into the Squeezing and Refining (SR) unit, which greatly reduced the amount of calculation and the number of parameters of the model. Secondly, a multi-scale Spatial Attention (SA) fusion module was introduced to make efficient use of the spatial information in shallow-layer features. Finally, the encoder was formed by stacking SR units, and two SA units were used to form the decoder. Simulation shows that SRNet obtains 68.3% Mean Intersection over Union (MIoU) on the Cityscapes dataset with only 30 MB of parameters and 8.8×10^9 FLoating-point Operations Per Second (FLOPS). In addition, the model reaches a forward inference speed of 12.6 Frames Per Second (FPS) with an input size of 512×1 024×3 pixels on a single NVIDIA Titan RTX card. Experimental results indicate that the designed lightweight model SRNet reaches a good balance between accurate segmentation and real-time inference, and is suitable for scenarios with limited computing power and power consumption.
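Why 1D convolutions inside a bottleneck reduce the parameter count can be shown with simple arithmetic. The sketch below compares a standard k×k convolution with a bottleneck of the form 1×1 reduce, k×1, 1×k, 1×1 expand. The exact layout of the SR unit is not given in the abstract, so this block structure, the reduction ratio of 4, and the function names are assumptions for illustration only; biases and normalization layers are ignored.

```python
def conv_params(c_in, c_out, k_h, k_w):
    """Weight count of a 2D convolution without bias.
    Dilation enlarges the receptive field but does not change this count."""
    return c_in * c_out * k_h * k_w


def standard_block_params(channels, k=3):
    """One ordinary k x k convolution that keeps the channel count."""
    return conv_params(channels, channels, k, k)


def bottleneck_1d_block_params(channels, k=3, reduction=4):
    """Assumed bottleneck: 1x1 reduce, k x 1 and 1 x k (1D) convolutions
    on the reduced channels, then 1x1 expand back to the input width."""
    mid = channels // reduction
    reduce_ = conv_params(channels, mid, 1, 1)
    vertical = conv_params(mid, mid, k, 1)
    horizontal = conv_params(mid, mid, 1, k)
    expand = conv_params(mid, channels, 1, 1)
    return reduce_ + vertical + horizontal + expand


if __name__ == "__main__":
    c = 64
    print(standard_block_params(c), bottleneck_1d_block_params(c))
```

For 64 channels and k = 3, the standard convolution has 36 864 weights while the assumed bottleneck has 3 584, roughly a tenfold reduction; the number of multiply-accumulate operations per output position shrinks in the same proportion.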
The unique advantages of Named Data Networking (NDN) make it a candidate for the next-generation Internet architecture. By analyzing the communication principle of NDN and comparing it with the traditional Transmission Control Protocol/Internet Protocol (TCP/IP) architecture, the advantages of the new architecture were described. On this basis, the key elements of the NDN architecture design were summarized and analyzed. In addition, to help researchers better understand this new network architecture, the successful applications of NDN after years of development were summarized. In line with mainstream technology trends, particular attention was paid to the support NDN provides for cutting-edge blockchain technology. Based on this support, the research and development of applications combining NDN and blockchain technology were discussed, and future prospects were outlined.